A Feed Forward Error Back Propagation Artificial Neural Network (ANN) algorithm is developed
for electron/positron identification in a wide momentum region (10 - 300 GeV/c). The method was
proposed for the Transition Radiation Detector of the E781 experiment at Fermilab. The package
consists of two parts:

• the program for the ANN training;

• the particle classification subroutine.

Both parts are built using the object oriented technique and the C++ language. The particle
identification algorithm is provided with FORTRAN wrappers so that it can be used in the E781
off-line program. The package performance was tested against the likelihood ratio method using
Monte Carlo generated data. Our study has demonstrated the excellent ability of the ANN to learn
even small details of the detector response function. The ANN solution gives the same performance
and behavior as the likelihood method when using Monte Carlo data with known detector parameters.
It demonstrates that, if trained with experimental data, the package can provide a very good
solution to the classification problem of e+/e- tracks.

INTRODUCTION 



In this work we have studied the ability of an Artificial Neural Network (ANN) based method applied to a Transition
Radiation Detector (TRD) to select e+/e- tracks in a high energy fixed target experiment.

In hadroproduction fixed target experiments, such as the E781 experiment at Fermilab, the primary interaction
produces many tracks in a narrow solid angle around the beam direction. Most of these particles are not of leptonic
nature. Nevertheless they can be responsible, directly or indirectly (for instance if they undergo secondary
interactions), for the appearance of ionization clusters in the TRD which can fake the e+/e- signal under investigation.

We have used GE781 [1], the GEANT [?] based Monte Carlo simulation of the E781 apparatus, in our studies. The
TRD model we are working with is made of 6 blocks, similar to the ones described in Ref. [?], each one composed of
a radiator (220 foils of CH2) and a Xe-CH4 filled proportional chamber with the ability to measure the x-coordinate
of the track. The detector operates in the cluster counting mode. Ionization clusters were counted only along each
track direction to reduce the influence of the background.

The momentum region of the tracks under investigation is from 10 to 300 GeV/c, which corresponds to the rising
part of the Transition Radiation yield curve for pions. Above this momentum it is more difficult to distinguish
pions from electrons using this detector.

To achieve appropriate background suppression as well as good signal efficiency we have exploited in our algorithm
the dependence of the detector response on the Lorentz factor $\gamma = E/m$ (where $E$ is the particle energy and
$m$ is its mass).

In the following we describe and discuss the maximum likelihood ratio technique and our ANN approach, and compare
their performance with Monte Carlo simulated data.


DESCRIPTION OF THE TRD LIKELIHOOD METHOD



This method was proposed for TRD detectors by M. L. Cherry et al. [?] and was demonstrated to give better
performance than traditional methods in Refs. [?].

The likelihood function is built to classify particles into two categories: e+/e- (type 1) and others,
mostly pions (type 2). If a particle of type $i = 1, 2$ with Lorentz factor $\gamma_i$ generates a sample
$X = \{x_1, x_2, \ldots, x_6\}$ of transition radiation clusters along its track, we can define the probability
$P(X|i,\gamma_i)$ of this event as

$$P(X|i,\gamma_i) = \prod_{k=1}^{6} P(x_k|\gamma_i). \tag{1}$$

The probability density functions $P(x_k|\gamma_i)$ were calculated using the detector response function.
We can define the likelihood ratio for a particle of type $i$ as

$$\mathcal{L}_i = \frac{P(X|i,\gamma_i)}{P(X|1,\gamma_1) + P(X|2,\gamma_2)}, \tag{2}$$

which is restricted to the interval $0 < \mathcal{L}_i < 1$, and use this ratio as an indicator of the particle type.
For each track one can calculate the above ratio under the two possible hypotheses. We expect this ratio to be closer
to 1 when the hypothesis is correct and closer to zero when it is wrong.
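
The likelihood classification described above can be illustrated with a minimal Python sketch. It is not the E781 code: the per-block probability density functions are replaced by toy Poisson distributions with hypothetical mean cluster counts (2.0 for the electron hypothesis, 0.5 for the hadron hypothesis), and the ratio is assumed to be normalised as P_i / (P_e + P_h), which is one form consistent with the stated bound between 0 and 1. The names `poisson_pdf`, `log_track_probability` and `likelihood_ratios` are introduced here for illustration only.

```python
import math

def poisson_pdf(mean):
    """Return a toy per-block pdf P(n) for n clusters (assumed Poisson)."""
    def pdf(n):
        return math.exp(-mean) * mean ** n / math.factorial(n)
    return pdf

def log_track_probability(clusters, pdfs):
    """Logarithm of Eq. (1): sum of log P(x_k | gamma_i) over the TRD blocks."""
    if len(clusters) != len(pdfs):
        raise ValueError("need one pdf per TRD block")
    return sum(math.log(pdf(x)) for x, pdf in zip(clusters, pdfs))

def likelihood_ratios(clusters, pdfs_e, pdfs_h):
    """Return (L_e, L_h), ASSUMING the normalisation L_i = P_i / (P_e + P_h)."""
    log_pe = log_track_probability(clusters, pdfs_e)
    log_ph = log_track_probability(clusters, pdfs_h)
    shift = max(log_pe, log_ph)  # subtract the larger log to avoid underflow
    pe = math.exp(log_pe - shift)
    ph = math.exp(log_ph - shift)
    return pe / (pe + ph), ph / (pe + ph)

# Hypothetical mean cluster counts per block; NOT the E781 response function.
N_BLOCKS = 6
pdfs_electron = [poisson_pdf(2.0)] * N_BLOCKS
pdfs_hadron = [poisson_pdf(0.5)] * N_BLOCKS

electron_like = [2, 3, 1, 2, 2, 3]  # many clusters along the track
hadron_like = [0, 1, 0, 0, 1, 0]    # few clusters along the track

print(likelihood_ratios(electron_like, pdfs_electron, pdfs_hadron))
print(likelihood_ratios(hadron_like, pdfs_electron, pdfs_hadron))
```

In a real application the pdfs would depend on the Lorentz factor computed under each hypothesis, so the two lists of pdfs would be rebuilt for every track.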



DESCRIPTION OF THE TRD NEURAL NETWORK 



In order to achieve the maximal performance in electron identification with the TRD we have developed an algorithm
using a Feed Forward Error Back Propagation Artificial Neural Network. A similar application was developed
in Ref. [?] for 10 TRD modules used for cosmic ray lepton identification.

As we have 6 TRD modules we use 6 input nodes with a linear response function, which receive the cluster
sum along the track in each TRD block, normalized to unity. To take the $\gamma$ dependence of the detector response
into account explicitly, and consequently widen the momentum region in which the algorithm can provide an efficient
classification of tracks, an extra linear node was introduced. This node was fed with the $\gamma$ factor calculated
in the pion hypothesis and normalized to one, i.e.

$$\mathrm{node}\ \gamma\ \mathrm{activation} = \frac{E}{m_\pi \gamma_{cut}}, \tag{3}$$

where $E$ is the energy of the particle, $m_\pi$ the mass of the pion, and $\gamma_{cut}$ was chosen to be 2200,
which corresponds to pions of about 310 GeV/c. For $\gamma$ greater than $\gamma_{cut}$ the node $\gamma$ activation
is set equal to one.
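
As a small illustration of Eq. (3), the activation of the extra node can be sketched in Python as follows. The function name and the constant names are hypothetical; the charged pion mass of 0.13957 GeV/c^2 is the standard value and is not quoted in the text.

```python
PION_MASS_GEV = 0.13957  # charged pion mass in GeV/c^2 (standard value)
GAMMA_CUT = 2200.0       # saturation value of the Lorentz factor quoted in the text

def gamma_node_activation(energy_gev, mass_gev=PION_MASS_GEV, gamma_cut=GAMMA_CUT):
    """Eq. (3): gamma in the pion hypothesis, normalized to gamma_cut and capped at one."""
    return min(energy_gev / (mass_gev * gamma_cut), 1.0)

print(gamma_node_activation(30.0))    # well below saturation
print(gamma_node_activation(1000.0))  # above gamma_cut, so capped at one
```

With these values the node saturates at an energy of about 307 GeV, in agreement with the figure of about 310 GeV/c given for pions.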

We start with 7 input nodes and would like to have the same classification of particles as in the previous
method, that is, two output nodes. According to the Kolmogorov theorem [?] we can then approximate our classification
function with $2n + 1 = 15$ nodes in the hidden layer for $n = 7$ inputs. This defines the structure of the network
as 7+15+2 nodes, as presented in Fig. 1.




FIG. 1. The TRD neural network structure.

The first layer of nodes is fully connected to the second layer of 15 nodes with a sigmoid response function, which
is followed by the output layer of 2 nodes (res 1 and res 2). Each neuron of one layer receives as input the outputs
of all neurons from the previous layer with weights defined by the synaptic matrix $W$. The activation level of the
output nodes provides the track classification.

The sigmoid function we have used is defined as follows:

$$\sigma(x) = \frac{1}{1 + \exp(-a(x - b))}, \tag{4}$$

where $b$ is the neuron threshold and $a$ is the gain. The responses of the two output nodes are therefore bounded to
the interval [0, 1] and provide a classification scheme similar to that of the previous method.
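
The forward pass through the 7+15+2 network can be sketched in Python as below. This is an illustration under stated assumptions, not the package itself: the weights are random placeholders rather than trained values, all thresholds are set to zero, the gain is one, and the output nodes are assumed to use the same sigmoid as the hidden nodes. The names `sigmoid_layer` and `classify` are hypothetical.

```python
import math
import random

def sigmoid(x, gain=1.0, threshold=0.0):
    """Eq. (4): 1 / (1 + exp(-a (x - b))) with gain a and threshold b."""
    return 1.0 / (1.0 + math.exp(-gain * (x - threshold)))

def sigmoid_layer(inputs, weights, thresholds, gain=1.0):
    """Each node sums the weighted outputs of the previous layer, then applies the sigmoid."""
    return [
        sigmoid(sum(w * v for w, v in zip(row, inputs)), gain, b)
        for row, b in zip(weights, thresholds)
    ]

def classify(inputs, w_hidden, b_hidden, w_out, b_out):
    """Linear input nodes pass their values on unchanged; return (res 1, res 2)."""
    hidden = sigmoid_layer(inputs, w_hidden, b_hidden)
    return sigmoid_layer(hidden, w_out, b_out)

# Placeholder weights (random, NOT trained) for the 7+15+2 structure.
rng = random.Random(0)
N_IN, N_HIDDEN, N_OUT = 7, 15, 2
w_hidden = [[rng.uniform(-1, 1) for _ in range(N_IN)] for _ in range(N_HIDDEN)]
b_hidden = [0.0] * N_HIDDEN
w_out = [[rng.uniform(-1, 1) for _ in range(N_HIDDEN)] for _ in range(N_OUT)]
b_out = [0.0] * N_OUT

# Six normalized cluster sums followed by the gamma node activation (toy values).
inputs = [0.4, 0.2, 0.6, 0.0, 0.4, 0.2, 0.35]
res1, res2 = classify(inputs, w_hidden, b_hidden, w_out, b_out)
print(res1, res2)
```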

The training process was performed with the standard back-propagation technique [10], during which the corrections
to the synaptic matrix elements $W_{ik}$ were calculated according to the rule:

$$\Delta W_{ik}(n) = -S \frac{\partial E[W]}{\partial W_{ik}} + M \Delta W_{ik}(n-1), \tag{5}$$

where $\Delta W_{ik}(n)$ is the correction to the synaptic matrix element $W_{ik}$ after $n$ steps, $E[W]$ is the
summed square error function, $S$ is the learning rate parameter and $M \Delta W_{ik}(n-1)$ is the momentum term used
to avoid sudden oscillations. In our implementation we have used $S = 0.1$ and $M = 0.3$.
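
The update rule of Eq. (5) can be sketched in Python as follows, using the quoted learning rate 0.1 and momentum 0.3. To keep the example self-contained, the error function is a toy quadratic, E[W] = sum of (W - target)^2, whose gradient is known in closed form; in the real network the gradient would come from back-propagation. The function name `momentum_update` and the target values are hypothetical.

```python
def momentum_update(weights, gradients, previous_deltas, learning_rate=0.1, momentum=0.3):
    """Eq. (5): delta = -S * dE/dW + M * previous delta, applied element by element."""
    deltas = [
        -learning_rate * g + momentum * d
        for g, d in zip(gradients, previous_deltas)
    ]
    updated = [w + d for w, d in zip(weights, deltas)]
    return updated, deltas

# Toy quadratic error with hypothetical targets; its gradient is 2 * (W - target).
targets = [0.5, -1.0, 2.0]
weights = [0.0, 0.0, 0.0]
deltas = [0.0, 0.0, 0.0]
for _ in range(200):
    gradients = [2.0 * (w - t) for w, t in zip(weights, targets)]
    weights, deltas = momentum_update(weights, gradients, deltas)

print(weights)  # approaches the targets
```

The momentum term reuses a fraction of the previous correction, which smooths the trajectory of the weights from step to step.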
The method consists of two independent parts: 

• the training program; 

• the particle classification subroutine. 

Both parts were written in C++ using the Object Oriented (OO) approach. As the basis for our algorithms we
have used modified classes provided by Robert Klapper [13]. The training procedure can use either Monte Carlo or
experimental data. The result of the training program is a file describing the ANN structure and parameters. This
file is then read by the particle identification algorithm at the initialization stage, which makes it possible to
update the ANN parameters dynamically. The particle identification algorithm is provided with FORTRAN wrappers so
that it can be used in the E781 off-line program.

To train our network we have used Monte Carlo simulated data with a flat momentum distribution over the whole
momentum region. For this purpose the detector response function was studied and implemented in the Monte Carlo. It
is in principle also possible to use experimental data for training, but in that case some independent tagging
of electrons is needed to prepare pure, high-statistics samples of particles over a wide momentum region.



COMPARISON OF THE METHODS 



To compare the performance of the two methods we have used single tracks generated in the whole momentum
region and Monte Carlo simulated hadronic interactions enriched with leptonic processes. In the latter case we have
exactly the same momentum distribution of particles as expected in the experiment.

To choose a suitable working point for the efficiency to contamination ratio we can apply correlated cuts in the
feature space of $\mathcal{L}_e$ (in the e+/e- hypothesis) and $\mathcal{L}_h$ (in the other particles hypothesis)
for the likelihood method, and in the feature space of res 1 and res 2 for the artificial neural network. This can
be done in the following way:

$$\mathcal{L}_e > cut \quad \mathrm{and} \quad \mathcal{L}_h < 1 - cut, \tag{6}$$

for the likelihood ratio case and

$$\mathrm{res\ 1} > cut \quad \mathrm{and} \quad \mathrm{res\ 2} < 1 - cut, \tag{7}$$

for the network, where cut is any real value from zero to one. By changing the value of the cut we can build plots of
the hadronic contamination as a function of the electron detection efficiency for the two methods.
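
The cut scan described above can be sketched in Python as follows. The pairs of scores are invented toy values standing for (res 1, res 2) or, equally, for the two likelihood ratios; they are not simulation results. The efficiency is taken as the fraction of true electron tracks passing the correlated cut, and the contamination is assumed here to be the fraction of hadron tracks passing it. The function names are hypothetical.

```python
def passes(electron_score, hadron_score, cut):
    """Correlated cut of Eqs. (6) and (7)."""
    return electron_score > cut and hadron_score < 1.0 - cut

def efficiency_contamination_curve(electrons, hadrons, cuts):
    """Return a list of (cut, efficiency, contamination) points, one per cut value."""
    curve = []
    for cut in cuts:
        efficiency = sum(passes(a, b, cut) for a, b in electrons) / len(electrons)
        contamination = sum(passes(a, b, cut) for a, b in hadrons) / len(hadrons)
        curve.append((cut, efficiency, contamination))
    return curve

# Toy (electron score, hadron score) pairs for tracks of known type; illustrative only.
electrons = [(0.95, 0.04), (0.80, 0.15), (0.60, 0.45), (0.30, 0.72)]
hadrons = [(0.05, 0.97), (0.20, 0.75), (0.55, 0.40), (0.10, 0.92)]

curve = efficiency_contamination_curve(electrons, hadrons, [0.1, 0.5, 0.9])
for cut, efficiency, contamination in curve:
    print(cut, efficiency, contamination)
```

Tightening the cut can only remove tracks from both samples, so efficiency and contamination both decrease (or stay constant) as the cut grows, and each cut value gives one point of the contamination versus efficiency plot.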

Single tracks simulation

FIG. 2. Efficiency and contamination for the two algorithms as a function of particle momenta.

We start our comparison of the acceptance and rejection power of the methods using single tracks generated in a
wide momentum range from 10 to 300 GeV/c. The momentum region was subdivided into bins of 30 GeV/c, and the
corresponding parameters were calculated for each bin. To compare the methods we have selected the cuts so as
to have approximately equal efficiencies for both methods in the first momentum bin.

As one can see in Fig. 2, both methods show similar behavior, and the observed growth of the contamination and
decrease of the efficiency correspond to the degradation of the classification power of the detector itself as the
momentum grows.

FIG. 3. Electron identification efficiency ($\epsilon_e$) versus hadronic contamination ($\epsilon_\pi$), taken from
simulation of hadroproduction data for the two methods under study.

Simulation of the hadronic interaction data 

To compare the two methods in experimental conditions, including hadronic background, secondary interactions and
a realistic momentum distribution, we decided to use Monte Carlo generated hadronic interactions with complete
simulation of the detector by the GE781 package. The hadronic background, as well as the electromagnetic one,
simulated by the package is expected to be very close to the experimental one.

The e+/e- efficiency versus the contamination by all other particles for such a sample is shown in Fig. 3. One can
see that the two methods are indistinguishable. We cannot give preference to either of the two methods from
the performance point of view.

CONCLUSIONS 

We have studied the performance of the ANN solution to particle classification with a TRD, which directly uses the
momentum information of the tracks.

The ANN solution gives the same performance and behavior as the likelihood method when using Monte Carlo data
with known detector parameters. It demonstrates that, if trained with experimental data, the package can provide a
very good solution, if not the best one, as the ANN can learn unknown properties of the detector and of the
experimental conditions that cannot be implemented in the likelihood method.